What Constitutes High Quality Data and How Does It Improve Business?
The advent of digital technology has changed the way certain terms are used. This includes “High Quality Data,” which has been especially prevalent in recent years. Despite its frequent use, there’s a lot of misinformation about data. Terms like “data,” “information,” and “knowledge” are often used interchangeably, so many people don’t realize they have different meanings.
Data refers only to numbers, images, and words that have not yet been organized or analyzed in response to a specific question (which is why it’s often referred to as “raw”). Once that raw data is organized and analyzed, it becomes information. Finally, once that information is added to what an individual already knows, it becomes part of their knowledge.
Although data refers to the basic building blocks of information and knowledge, we can still draw a distinction between low and high quality data.
Low quality data often results from human error or misunderstanding, along with improper collection and interpretation. Often, there is also competition between departments, such as IT, marketing, finance, and human resources, about who should collect data, how they should do it, and how it should be interpreted or used. This fragmentation prevents the smooth and orchestrated collection of quality data.
The problem isn’t new. As early as 1996, Wand and Wang reported that data quality problems were increasingly prevalent. Of particular note were problems related to warehousing and interpretation. Strong wrote in 1997 that poor quality data could have serious social and economic implications. Studies related to the quality of data have mainly focused on the intrinsic value of the data, with little research being done on its extrinsic value—things like usability, practicality, and generalization.
Assessing Data Quality
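What, then, constitutes “high quality data”? Studies show that high quality data has eight characteristics: accuracy, validity, completeness, reliability, relevance, consistency, uniqueness, and timeliness. Each is described in turn below, followed by a practical example.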
The information that goes into a database must be perfect, meaning that both the method of collection and what is collected must be flawless. Accurate data assists companies in making well-informed decisions about expansions, the goods they stock and offer, organizational procedures, promotions, goals, and downsizing decisions. As this list shows, data that is captured once can be repurposed for multiple uses.
For example, when an inspector is inspecting a piece of equipment, the electronic form shows only the information associated with that particular piece or type of equipment. This can be delivered through dropdown lists or pre-populated fields.
Data should be captured and utilized within the parameters of relevant requirements. It is said to be valid when its use and collection are within the proper application of stated rules or definitions. In other words, valid data measures exactly what it was intended to measure. Data validity ensures its consistency between periods and across similar businesses.
Better data validity is achieved through the use of electronic forms on the computer and mobile devices. Not allowing invalid data in the first place, at the time of entry, ensures the best possible information for reporting.
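To make the idea of rejecting invalid data at the time of entry concrete, here is a minimal sketch in Python. The form fields (equipment_id, status, hours_in_service) and the allowed status values are assumptions invented for illustration; a real form would define its own rules.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical rules for an inspection form. The field names and allowed
# values are assumptions for illustration, not taken from any real system.
ALLOWED_STATUSES = {"pass", "fail", "needs_repair"}


@dataclass
class InspectionEntry:
    equipment_id: str
    status: str
    hours_in_service: float


def validate_entry(entry: InspectionEntry) -> list[str]:
    """Return a list of problems; an empty list means the entry is valid."""
    problems = []
    if not entry.equipment_id.strip():
        problems.append("equipment_id must not be blank")
    if entry.status not in ALLOWED_STATUSES:
        problems.append(f"status must be one of {sorted(ALLOWED_STATUSES)}")
    if entry.hours_in_service < 0:
        problems.append("hours_in_service must not be negative")
    return problems
```

A form built this way would only save the entry when validate_entry returns an empty list, and would otherwise show the listed problems to the person filling it in.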
Partial information is incomplete and, therefore, does not paint an accurate picture of the situation. It’s like describing an elephant while looking only at its foot, diagnosing a patient without ordering all the pertinent tests, or completing a home study by merely viewing the realtor’s virtual tour. Incomplete information may as well be incorrect, since basing decisions on a partial picture of the situation can be misleading.
Oops! You forgot to fill in the description field! Not allowing an employee to submit incomplete work orders ensures all supervisors and mechanics have the information they need in order to do their job right the first time.
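Blocking incomplete work orders can be sketched in a few lines of Python. The required field names below (asset_id, description, reported_by) are hypothetical; the point is only that submission fails until every required field has a value.

```python
# Hypothetical list of fields a work order must contain before submission.
REQUIRED_FIELDS = ("asset_id", "description", "reported_by")


def missing_fields(work_order: dict) -> list:
    """Return the required fields that are absent or blank."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = work_order.get(name)
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


def submit(work_order: dict) -> dict:
    """Accept the work order only if it is complete; otherwise raise."""
    missing = missing_fields(work_order)
    if missing:
        raise ValueError("Cannot submit; missing: " + ", ".join(missing))
    return work_order
```

Here a blank description counts as missing, so a work order with only whitespace in that field is rejected just like one that omits it.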
Data should reflect stable and consistent collection processes at all collection points and across a given time span. If the collection process is consistent, the information collected is reliable. To ensure reliability, the data sources should be clearly identified and the method used to capture the data should be available and succinctly outlined so anyone could replicate it under similar circumstances.
Don’t have an internet connection? No problem. It’s increasingly important that your mobile apps allow data to be stored offline. This enables faster response time and minimizes data loss. Just sync when you get a connection.
Data captured must be meaningful relative to some specific purpose or question. Periodic reviews of the requirements ensure that the data collected reflects changing needs. Data is irrelevant when it isn’t current, doesn’t address the target question, or doesn’t include relevant stakeholders or service providers.
Why not add a picture to your inspection form? Show the shop mechanic exactly what you’re talking about to minimize expensive equipment downtime.
Consistency means that the same data is collected in the same way from each source and is entered in the same format and in the same location. Consistency is key when entering information into a database. For example, a database may be set up to collect phone numbers in ten-digit lengths. A number with more or fewer than ten digits will not be accepted and entered into the database because it is inconsistent with the rest of the data.
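The ten-digit phone number rule could look something like the following Python sketch. Stripping punctuation before counting digits is an assumption added here so that every stored number ends up in one format; a stricter system might reject anything that is not already ten bare digits.

```python
import re


def normalize_phone(raw: str) -> str:
    """Return the number as exactly ten digits, or raise if it cannot be."""
    # Drop everything that is not a digit (spaces, dashes, parentheses).
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return digits


print(normalize_phone("(555) 123-4567"))  # prints 5551234567
```

Because every accepted number is stored in the same ten-digit form, records from different sources can be compared and matched directly.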
Businesses are looking to improve over time, and the ability to benchmark different aspects of your business is a key factor in that improvement. Because everybody is using the exact same form for capturing a work process, you’re able to see and measure what’s working and what isn’t. Are certain brands or pieces of equipment breaking down more than others? Take educated action.
In order for data to be considered unique it must be distinctive. Generic data is available to everyone but unique data is held only by a single company. Unique data is a resource a company holds that none of its competitors do, which makes it competitive in its field.
Every asset in your business gets its own unique ID number. Track the lifecycle of every asset in your business to ensure you’re receiving maximum value from each one.
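As a rough illustration of unique asset IDs and lifecycle tracking, the Python sketch below keeps one event history per ID and refuses to register the same ID twice. The AssetRegistry class and its method names are hypothetical and exist only for this example.

```python
class AssetRegistry:
    """Hypothetical in-memory registry: one lifecycle history per asset ID."""

    def __init__(self):
        self._events = {}  # maps asset_id -> list of lifecycle events

    def register(self, asset_id: str) -> None:
        # Enforce uniqueness: the same ID can never be registered twice.
        if asset_id in self._events:
            raise ValueError(f"duplicate asset ID: {asset_id}")
        self._events[asset_id] = ["registered"]

    def record(self, asset_id: str, event: str) -> None:
        # Events can only be recorded against a known asset.
        if asset_id not in self._events:
            raise KeyError(f"unknown asset ID: {asset_id}")
        self._events[asset_id].append(event)

    def history(self, asset_id: str) -> list:
        # Return a copy so callers cannot alter the stored history.
        return list(self._events[asset_id])
```

With a structure like this, the full history of any asset (registered, inspected, repaired, retired) can be looked up from its ID alone.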
Data needs to be new, current, and up to date. Dated data can hold back a company’s productivity because decisions end up based on scenarios and figures that no longer reflect reality. Modern technology has made it easy to collect real-time data, allowing companies to respond to conditions as they currently are, not as they are predicted to be based on information compiled at an earlier date.
Know what needs to be done when it needs to be done. So you have 100 things on your list…what is the most important? Using a system that provides timely, prioritized information will propel your organization to the next level.
Investing in Quality Data
Even when data is of high quality, it doesn’t guarantee good returns on investment.
If the conditions are very stable over time or the decisions are insignificant, for instance, relying on low quality data might not be so bad. In those cases, collecting high quality data would be an unwise use of time and resources. This is why the first question you should ask before collecting data is what purpose it will serve. Answering that question could save you the trouble of devising an intricate data collection system and double checking its figures.
This and other data quality misunderstandings and mistakes aren’t cheap. In a 2010 study, Forbes estimated that companies spend $5 million annually on data collection problems. Twenty percent of companies surveyed estimated losses of $20 million annually for data-related mistakes. Forty percent of businesses reported a negative return on investment, and twenty percent of projects are discarded because of bad data.
In light of these dismal figures, why do companies invest in data collection?
High quality data has a major impact on business. It can open new possibilities for your company. Data can change the way your company does business, the kinds of employees it looks for, and the future plans and goals it sets for itself. Companies and organizations use data they can trust to manage products, services, and staff. They use it to evaluate performance and rate the corporation’s efficiency, response to clients, and the effectiveness of what those clients produce.
In order to judge the effectiveness of companies with which they deal, service providers, customers, and the community need a way to compare them. Data provides that information.
Although we have to be careful not to waste time collecting data we don’t need, its importance can’t be overstated. Every good business decision is an informed one, and good information is based on high quality data.