Is there a data problem?
“Numbers don’t lie. Women lie, men lie, but numbers don’t lie.” – Max Holloway
Data, even in its simplest form, is more than just numbers; it communicates meaningful information about our lives. Take salary, for example: we all expect it to go up, and when it comes down everybody notices. Typically no one complains when it goes up, but people never fail to report it when the numbers are down. Let us consider a hypothetical problem, or perceived problem, with salaries paid to employees.
In April, an employee named Gabriel reviews his salary, realizes it is $200 lower than it was in January, and promptly calls HR. HR reaches out to IT for clarification. Usually the HR software stores the details and they can be extracted easily, but in this case let’s assume that the calculations are executed in the backend code and only the results are stored in the table.
IT looks at the numbers and, sure enough, there is a drop. The general tendency is to assume that there is a problem in the system. With that assumption, the team queries the data and walks through the code, applying the business rules, but fails to find a smoking gun.
After spending many hours, they realize that the system works as designed, so the reason may lie outside the system and needs to be validated. Meanwhile, they come across a note saying that the bonus is paid at the end of the year but credited at the beginning of the next year. IT reviews this with the business owners, who confirm that a bonus given in December is credited in January. The employee is notified; he goes back and checks his December salary, and it matches his February salary, which confirms that January was the exception. However, if the same situation arises in the future, the same song and dance will have to be repeated to identify the cause, because the same employees may or may not be there to support the application. So it is critical that we design our systems with enough logging and that we record the business calculations as part of the system design.
In this case, we should have included the bonus information in the table, along with a breakdown showing how we arrived at the employee’s final salary.
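To make the idea concrete, here is a minimal sketch of what storing the breakdown next to the result might look like. Everything in it is an assumption made for illustration: the `SalaryRecord` structure, its field names, the `explain_difference` helper, and the base pay of 5000 are hypothetical. Only the $200 difference and the December bonus credited in January come from the story above.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class SalaryRecord:
    """One stored salary row that keeps its components, not just the result."""
    employee: str
    month: str
    base_pay: Decimal                 # hypothetical figure, not from the story
    bonus: Decimal = Decimal("0")
    bonus_note: str = ""              # free-text reason, e.g. when the bonus was earned

    @property
    def total(self) -> Decimal:
        # The final salary is derived from the stored components.
        return self.base_pay + self.bonus


def explain_difference(earlier: SalaryRecord, later: SalaryRecord) -> list[str]:
    """List every component that changed between two salary records."""
    changes = []
    for field in ("base_pay", "bonus"):
        delta = getattr(later, field) - getattr(earlier, field)
        if delta != 0:
            changes.append(f"{field}: {delta:+}")
    return changes


january = SalaryRecord(
    "Gabriel", "January", Decimal("5000"),
    bonus=Decimal("200"),
    bonus_note="December bonus credited in January",
)
april = SalaryRecord("Gabriel", "April", Decimal("5000"))

print(april.total - january.total)         # -200
print(explain_difference(january, april))  # ['bonus: -200']
```

With a record like this, the question "why is April $200 lower than January?" could be answered from the stored data alone, without walking through the backend code.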
The above use case is for demonstration purposes only, to explain the concept in simple terms; real-life structures are much more complex.
Bottom line: when you analyze data, trace the process step by step, check whether the numbers jibe with the previous step, and document each step until you reach the end point. We need to approach these real or perceived data issues with an open mind and without bias.
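The step-by-step tracing described here can be sketched in a few lines. This is only an illustration under stated assumptions: the `trace` helper, the step names, and all amounts except the $200 bonus are hypothetical and do not come from any real payroll system.

```python
from decimal import Decimal


def trace(start, steps):
    """Apply each step in order, recording the value before and after it."""
    trail = []
    value = start
    for name, step in steps:
        new_value = step(value)
        trail.append((name, value, new_value, new_value - value))
        value = new_value
    return value, trail


# Hypothetical calculation steps; a real pipeline would have many more.
steps = [
    ("add bonus", lambda v: v + Decimal("200")),
    ("apply deduction", lambda v: v - Decimal("50")),
]

final, trail = trace(Decimal("5000"), steps)
for name, before, after, change in trail:
    print(f"{name}: {before} -> {after} ({change:+})")
```

Each line of the trail documents one step, so a number that does not agree with the previous step stands out at the point where it was introduced.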