Key Points
- Attribute data describes characteristics that aren’t readily quantified by hard numbers.
- It is a discrete data type.
- Making use of it requires more data on hand, as well as a robust method of measuring.
The world of data consists of things that you measure and things that you count. The terms attribute data and discrete data are similar but distinct enough to warrant a closer look. Let’s explore the differences so you will have a better understanding of attribute data, how to use it, its advantages and disadvantages, and some best practices for collecting and analyzing it.
What Is Attribute Data?
In the world of data, there are things we measure and things we count. Data that we derive from measuring things is called continuous data. A good definition of continuous data is that it is measurable by some measuring device (e.g., stopwatch, scale, tape measure), it can take on any value across a continuum of possible values, and it can be logically subdivided.
For example, your height can be measured with a tape measure, it can take on any value across a continuum of possible values, and it can be logically subdivided into feet, inches, one-quarter inches, one-eighth inches, etc. Continuous data is valued because of the precision allowed by the logical subdivision of values.
Discrete or attribute data, by contrast, comes from counting things. Discrete data can be further refined into discrete numeric data and discrete attribute data. Examples of discrete numeric data might be the number of errors on your invoice, the number of rejected parts on your manufacturing line, and the number of people on hold waiting for your customer service rep to pick up the phone.
Discrete attribute data is a little different. With this type of data, we assign a numeric value to some qualitative characteristic. If these qualitative characteristics have a logical order, we can refer to the result as discrete ordinal data. A classic example is the Likert scale. Here we can order the attributes: Strongly agree, Moderately agree, Neutral, Moderately disagree, and Strongly disagree. In our survey, we would assign them numeric values such as 5, 4, 3, 2, and 1. We can then count the number of responses in each category.
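As a minimal sketch of that coding-and-counting step, the Python below maps each Likert label to its numeric code and tallies the responses per category. The response list is made up for illustration, and the names LIKERT_CODES and tally_by_code are hypothetical.

```python
from collections import Counter

# Coding of the Likert scale described above (label -> numeric code).
LIKERT_CODES = {
    "Strongly agree": 5,
    "Moderately agree": 4,
    "Neutral": 3,
    "Moderately disagree": 2,
    "Strongly disagree": 1,
}

# Made-up survey responses, for illustration only.
responses = [
    "Strongly agree", "Neutral", "Moderately agree", "Strongly agree",
    "Moderately disagree", "Neutral", "Strongly disagree", "Moderately agree",
]

def tally_by_code(answers):
    """Count how many responses fall into each coded category."""
    counts = Counter(LIKERT_CODES[a] for a in answers)
    # Report every category, highest code first, including empty ones.
    ordered_codes = sorted(LIKERT_CODES.values(), reverse=True)
    return {code: counts.get(code, 0) for code in ordered_codes}

counts = tally_by_code(responses)
for code, n in counts.items():
    print(f"code {code}: {n} responses")
```

The codes only preserve the order of the categories; the output is a count per category, not a measurement.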
Putting It Together
We can also have some discrete attribute data that is not ordered. For example, we can define our attributes in terms of types of products. A glass company may categorize its products as laminated glass, tempered glass, insulated glass, and coated glass. There is no logical order or preference; they are just different. We can assign a numerical code to each type and can even count the number of each.
Why does it make a difference what kind of data we have? Because the type of analytical tool we use is based on the type of data we have. As stated earlier, we prefer continuous data because it is more robust or flexible and provides a greater refinement of the data. Many people are tempted to collect continuous data and then convert it to discrete or attribute data. For example, suppose you are collecting the delivery time for each order. That is continuous data. If you then convert it into binary attribute data consisting of on-time/not on-time, you lose a lot of information that would likely be useful in analyzing that process.
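To illustrate the information lost in that conversion, here is a small Python sketch. The delivery times, the two processes, and the 5-day target are all invented for the example; the point is only that two very different processes can collapse to the same on-time rate.

```python
from statistics import mean, stdev

TARGET_DAYS = 5.0  # hypothetical on-time threshold

# Made-up delivery times in days for two hypothetical processes.
process_a = [4.8, 4.9, 4.7, 4.9, 5.2, 4.8, 4.9, 5.1]
process_b = [1.0, 2.5, 4.9, 3.0, 9.5, 1.5, 2.0, 12.0]

def on_time_rate(times, target=TARGET_DAYS):
    """Collapse continuous times into the binary on-time / not on-time attribute."""
    return sum(t <= target for t in times) / len(times)

for name, times in (("A", process_a), ("B", process_b)):
    print(
        f"Process {name}: on-time rate {on_time_rate(times):.0%}, "
        f"mean {mean(times):.2f} days, std dev {stdev(times):.2f} days"
    )
```

Both processes report the same on-time rate, yet the continuous data shows that one is tightly clustered near the target while the other swings between very early and very late deliveries.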
Why Quantify It?
When you’re doing any sort of data analysis, there is going to come a point where you need to sit down and look at things without hard numbers behind them. While you can certainly verify the accuracy or precision of a process’s output, how do you quantify something like the color? This is where attribute data comes in handy: it gives you additional data points for use in your analysis.
Drawbacks of Attribute Data
While it may seem easier to understand and apply, attribute data has several drawbacks that detract from its usefulness.
Requires More Data for Analysis
You can quickly gain insight into a process with continuous data. You need a considerably larger sample size of attribute data to understand the underlying process.
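A rough feel for the difference can be had from the textbook normal-approximation sample-size formulas. The sketch below is illustrative only: the 95% confidence level, the margins of error, and the 1-day standard deviation are assumptions chosen for the example, and the two margins are not directly comparable quantities.

```python
from math import ceil

Z_95 = 1.96  # approximate z-value for 95% confidence

def n_for_proportion(margin, p=0.5, z=Z_95):
    """Sample size to estimate a proportion within +/- margin.

    Uses the normal approximation; p=0.5 is the worst case.
    """
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

def n_for_mean(margin, sigma, z=Z_95):
    """Sample size to estimate a mean within +/- margin, given an assumed std dev."""
    return ceil((z * sigma / margin) ** 2)

# Attribute view: on-time proportion within +/- 5 percentage points.
print("proportion:", n_for_proportion(margin=0.05))

# Continuous view: mean delivery time within +/- 0.25 days, assuming sigma = 1 day.
print("mean:", n_for_mean(margin=0.25, sigma=1.0))
```

Under these assumptions the attribute estimate needs several hundred observations while the continuous estimate needs a few dozen; different assumptions will shift the numbers, but the gap is typical.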
Measurement System
The basis of any good data set is the accuracy and precision of the measurement system capturing the data. Attribute data often relies on human judgment to classify each item. Human judgment is typically less repeatable and less consistent from person to person than a measurement device.
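One common way to check such a measurement system is to have two appraisers classify the same items and compare their calls. The sketch below computes simple percent agreement and Cohen's kappa, which corrects for agreement expected by chance. The pass/fail ratings are made up, and the function name cohens_kappa is our own.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two appraisers, corrected for chance agreement."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Share of items on which the two appraisers gave the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected if each appraiser rated independently at their own rates.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both appraisers used one identical category throughout
    return (observed - expected) / (1 - expected)

# Made-up pass/fail calls by two appraisers on the same ten parts.
appraiser_1 = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail", "pass", "pass"]
appraiser_2 = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass", "pass", "pass"]

agreement = sum(a == b for a, b in zip(appraiser_1, appraiser_2)) / len(appraiser_1)
kappa = cohens_kappa(appraiser_1, appraiser_2)
print(f"percent agreement: {agreement:.0%}, kappa: {kappa:.2f}")
```

A kappa well below the raw percent agreement is a hint that part of the apparent agreement is due to chance and that the appraisers may need a clearer definition of the attribute.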
Operational Definition
Unless there is an agreed-upon definition of the attribute you’re measuring, there is a strong likelihood of confusion about what is being collected. What does “Strongly agree” really mean? Different people may have different definitions of this term. That has to be resolved so you can have confidence that everyone is collecting the data the same way.
Why Is It Important to Understand?
Making the distinction between attribute and continuous data, and even between attribute and discrete data, is critical to collecting and analyzing your data.
Attribute vs. Discrete Data
While these two terms are often used interchangeably, they differ enough that you must understand the difference to properly define and collect your data.
Using the Correct Analytical Tool
Using the wrong analytical tool for the data you’ve collected can result in incorrect conclusions.
Correct Statistics
Understanding the correct tool to use is one challenge. Using the correct statistics to describe your sample and assumed population is another challenge. The purpose of any data collection is to learn about your process. The type of data you collect and use is the foundation for proper analysis.
An Industry Example
To better understand how employees feel about the company, the head of human resources distributed a survey to the company’s employees. Several questions were asked with the possible responses being in the form of a Likert scale of Strongly Agree (5), Agree (4), Neutral (3), Disagree (2), and Strongly Disagree (1). The results were tabulated, and an HR manager prepared a report to be distributed to senior leadership. Before dissemination, the manager asked her Master Black Belt (MBB) to review and comment on the presentation.
Unfortunately, rather than treat the data as pure attribute data, which it was, the manager chose to report the results as if they were pure continuous data, which they were not. She added up all the numbers and calculated averages for all the categories. For example, she reported that for one critical question, the average response was 3. The MBB pointed out that an average of 3 could result from half the values being 1 and half being 5. Or half being 2 and half being 4. Or all the values could have been a 3. The average alone could not distinguish among these very different situations, so it told leadership very little.
In the end, the MBB convinced the manager that it would be better to present the data as attribute data and not try to treat it as continuous. That meant that the results should be presented as the number of responses for 5, the number of responses for 4, and so forth. The manager could also report the values as percentages. That is, 15% of the responses were for Agree, while 25% were for Strongly Disagree. This was the better way to report the attribute data.
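The MBB’s point can be reproduced in a few lines of Python. The three response sets below are invented to match the scenarios described (20 respondents each is an assumption); all of them average 3, but their percentage breakdowns are entirely different.

```python
from collections import Counter
from statistics import mean

# Three made-up sets of responses to the same question, 20 respondents each.
scenarios = {
    "polarized": [1] * 10 + [5] * 10,
    "mild split": [2] * 10 + [4] * 10,
    "all neutral": [3] * 20,
}

def percent_by_score(scores):
    """Percentage of responses in each Likert category, 5 down to 1."""
    counts = Counter(scores)
    return {s: 100 * counts.get(s, 0) / len(scores) for s in (5, 4, 3, 2, 1)}

for name, scores in scenarios.items():
    print(f"{name}: average {mean(scores)}, percentages {percent_by_score(scores)}")
```

Reporting the percentages per category makes the polarized group, the mildly split group, and the uniformly neutral group immediately distinguishable, which the average cannot do.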
Best Practices
In most cases, you may delegate your statistical analysis to those more experienced and knowledgeable about statistics. In any case, you should be aware of some of the best practices so you can assess whether your experts are doing what you need them to do.
Process Stability
Process stability means the process exhibits only common cause variation, and it is assessed through the use of SPC control charts. If your process is stable, it is then predictable. Therefore, the data that you are collecting should come from a stable process.
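For attribute data such as the fraction of rejected parts, one commonly used control chart is the p-chart. The sketch below computes its center line and 3-sigma limits for a constant subgroup size and flags subgroups that fall outside them. The reject counts and the subgroup size of 100 are made up, and the function name p_chart_limits is our own.

```python
from math import sqrt

def p_chart_limits(defectives, sample_size):
    """Return (center, lcl, ucl) for a p-chart with a constant subgroup size."""
    p_bar = sum(defectives) / (len(defectives) * sample_size)
    sigma = sqrt(p_bar * (1 - p_bar) / sample_size)
    # A proportion cannot go below 0 or above 1, so clamp the limits.
    lcl = max(0.0, p_bar - 3 * sigma)
    ucl = min(1.0, p_bar + 3 * sigma)
    return p_bar, lcl, ucl

# Made-up counts of rejected parts in ten subgroups of 100 parts each.
rejects = [4, 6, 5, 3, 7, 5, 4, 6, 5, 15]
n = 100

center, lcl, ucl = p_chart_limits(rejects, n)
out_of_control = [
    i for i, d in enumerate(rejects, start=1) if not lcl <= d / n <= ucl
]
print(f"center {center:.3f}, LCL {lcl:.3f}, UCL {ucl:.3f}")
print("subgroups outside the limits:", out_of_control)
```

A point outside the limits suggests special cause variation worth investigating before the data is used for further analysis. In practice, dedicated SPC software also applies additional run rules that this sketch omits.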
Plot the Data
If possible, always plot the data before embarking upon any complex statistical analysis. A picture is worth a thousand words, so make use of graphs such as frequency diagrams, bar charts, and even control charts.
Computer Software
The days of doing statistical or graphical analysis by hand are long gone. There is a plethora of software programs, both sophisticated and basic, that you can use to do your analysis.
Data Collection Plan
Be sure you have a solid data collection plan that clearly defines what you’re going to collect data on, how you are going to collect it, who is going to collect it, and how much data you need to collect.
Other Useful Tools and Concepts
Looking for other ways to improve your organization? You might want to take a look at process time. This is actual time spent working on a process or specific task and can indicate whether there is waste present in the way your workers conduct themselves.
Further, you might want to explore trend analysis tools. Data is a big deal when it comes to Six Sigma, and leveraging that data to make future decisions is like having a crystal ball. While not foolproof, it can prepare you for changing customer needs and changes to the market.
Conclusion
Attribute data is a form of discrete data. It is represented by counts rather than measurements. It can be numeric, ordinal, non-ordered, and even binary. It’s generally descriptive. If your process is such that it only generates attribute data and that’s all you can collect data on, then that’s what you have to work with. But if you can collect continuous data, do not convert it into attribute data.
It is important to properly define and understand the nature of your data so you can utilize the appropriate statistics and analytical tools. Using the wrong tools for the wrong type of data greatly diminishes the value of your analysis and conclusions.