• The Grain Matrix is a spreadsheet that captures the levels of reporting for each subject area or measurement. It is the spreadsheet view of a dimensional model.
Chapter 12: What is the Data Model Scorecard®?
Overview
Pay now, not later
Get the data model right
Sleep better at night
A frequently overlooked aspect of data quality management is that of data model quality. We often build data models quickly, in the midst of a development project, and with the singular goal of database design. Yet the implications of those models are far-reaching and long-lasting. They affect the structure of implemented data, the ability to adapt to change, understanding of and communication about data, definition of data quality rules, and much more. In many ways, high-quality data begins with high-quality data models. Therefore, because a good data model can lead to a good application, and similarly, a bad data model can lead to a bad application, we need an objective way of measuring what is good or bad about the model. After reviewing hundreds of data models, I formalized the criteria I have been using into what I call the Data Model Scorecard®. This chapter will explain the Data Model Scorecard® and its ten categories.
Data Model Scorecard® Explained
The Scorecard is shown in Table 12.1. Each of the 10 categories has a total score that relates to the value your organization places on the question. Just as in any assessment, the total must be 100.
Table 12.1: Data Model Scorecard® template

#   Category                                            Total score  Model score  %  Comments
1   How well do the characteristics of the model        10
    support the type of model?
2   How well does the model capture the requirements?   15
...
5   How well does the model leverage generic            10
    structures?
...
    TOTAL SCORE                                         100
The model score column contains the results of how a particular model did, with the maximum score being the value that appears in the total score column. For example, if a model received 10 on "How well does the model capture the requirements?" then you would put 10 in this column. The % column presents the Model Score for the category divided by the Total Score for the category. For example, receiving 10 out of 15 would lead to 66%. In the comments column, place any pertinent information that explains the score in more detail or captures the action items on what is required to fix the model. The last row contains the overall score assigned to the model, the sum of each score column.
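The column arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not part of the Scorecard itself: the `ScorecardRow` class and `totals` function are hypothetical names, the two rows are a sample excerpt, and the percentage is truncated to a whole number on the assumption that this is how the 10-out-of-15 example arrives at 66%.

```python
from dataclasses import dataclass


@dataclass
class ScorecardRow:
    number: int
    category: str
    total_score: int   # weight the organization places on the category
    model_score: int   # points the reviewed model earned (0..total_score)
    comments: str = ""

    def percent(self) -> int:
        # Whole percent, truncated (assumption: matches the 10/15 -> 66% example).
        return 100 * self.model_score // self.total_score


def totals(rows):
    """Return (sum of total scores, sum of model scores) for the last row."""
    return (sum(r.total_score for r in rows), sum(r.model_score for r in rows))


# Sample excerpt only; a full scorecard has ten rows whose total scores add up to 100.
rows = [
    ScorecardRow(1, "How well do the characteristics of the model support the type of model?", 10, 10),
    ScorecardRow(2, "How well does the model capture the requirements?", 15, 10),
]

for row in rows:
    print(f"{row.number}: {row.model_score}/{row.total_score} = {row.percent()}%")
print("TOTAL SCORE", totals(rows))
```

If your organization prefers rounded percentages, swapping the integer division for `round(100 * model_score / total_score)` would report 67% for the same row.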
Table 12.2 includes an example of the Scorecard template filled in.
Table 12.2: Data Model Scorecard® example

#   Category                                       Total score  Model score  %     Comments
1   How well do the characteristics of the         10           10           100%  Lots of processing data elements
    model support the type of model?
...
10  How well does the metadata match the data?     10           10           100%  Handles changing natural account numbers
    TOTAL SCORE                                    100          91
The model that was reviewed in this example received a score of 91. Category 4 is a strong candidate for improvement and categories 6 and 7 also contain areas that could be improved. It is useful to provide a document that explains the results in more detail, to accompany a completed scorecard. In the example from Table 12.2, this document was over 50 pages in length. Both strengths and areas for improvement are explained in detail through a complete set or representative set of examples. For example, Category 2 lost some points because this model had several incorrect alternate keys. In the accompanying document, those entities with suspect alternate keys are captured.
Feel free to use the Scorecard on your own projects; just please add this reference:
• Steve Hoberman & Associates, LLC hereby grants to organizations a non-exclusive royalty-free limited use license to use the Data Model Scorecard® solely for internal data model improvement purposes. The name 'Steve Hoberman & Associates, LLC' and the website 'www.stevehoberman.com' must appear on every document referencing the Data Model Scorecard®. Organizations have no right to sublicense the Data Model Scorecard® and no right to use the Data Model Scorecard® for any purposes outside of the organization's business.
• The Scorecard starts by assuming the model is perfect. As analysts, we sometimes notice immediately what is wrong. This can lead to quickly pointing out the negatives in designs, which, in turn, can make us blind to what is good in the model, causing conflict or hard feelings among project team members. The Scorecard starts off with a perfect score of 100. We then subtract points from this score for categories where we identify areas that need improvement.
• The Scorecard is objective and externally-defined. I have participated in model reviews where modelers take the review personally and comments take the form of "I don't like what you did here…" or "You are still not getting this structure right…" We need to step back from the 'I' and 'You', and critique the model with an external and objective perspective. Team rapport remains intact as you, the reviewer, are not criticizing their model, but rather evaluating how well the model meets pre-defined objectives, using an external scale to indicate areas for improvement.
• The Scorecard is easy to apply and standardize. The Scorecard was designed to enable even those new to modeling to critique their own models and the models of their colleagues. It should be incorporated into your methodology as a final checkpoint before the model is considered complete.
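The "start perfect, then subtract" idea from the first point above can be sketched as follows. Treat this as a rough illustration under stated assumptions: `apply_deductions` is a hypothetical helper, the total scores for categories 3, 4, and 6 through 9 are placeholders chosen only so the weights add up to 100, and the deductions are invented for the example rather than taken from any reviewed model.

```python
def apply_deductions(total_scores, deductions):
    """Start each category at its full total score, then subtract points.

    total_scores: {category number: total score}
    deductions:   list of (category number, points lost, comment)
    Returns (model score per category, comments per category).
    """
    model_scores = dict(total_scores)  # assume the model is perfect
    comments = {number: [] for number in total_scores}
    for number, points, comment in deductions:
        if points < 0 or points > model_scores[number]:
            raise ValueError(f"invalid deduction for category {number}")
        model_scores[number] -= points
        comments[number].append(comment)
    return model_scores, comments


# Placeholder weights (assumption); each organization sets its own, summing to 100.
total_scores = {1: 10, 2: 15, 3: 15, 4: 15, 5: 10, 6: 5, 7: 5, 8: 10, 9: 5, 10: 10}
assert sum(total_scores.values()) == 100

# Hypothetical findings from a review.
deductions = [
    (2, 1, "Several incorrect alternate keys"),
    (4, 3, "Hypothetical structural issue"),
]

model_scores, comments = apply_deductions(total_scores, deductions)
print("Overall score:", sum(model_scores.values()))
```

Recording a comment with every deduction keeps the review tied to specific findings in the model instead of to the reviewer's opinion of the modeler.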
Let's explore each of these ten categories in more detail.