All the problems touched on in this book converge on the logical model (see Chapter 2). It is in this medium that all the things an enterprise deals with must be reduced to crisply structured descriptions.
The logical model will be a very real computer-related construct, just like a program or a data file. An enterprise is going to have a large amount of time, effort, and money invested in the logical model.
There is the learning investment. In spite of our best efforts, any formalism we adopt as the basis of the logical model will still be an artificial structure. The concepts will not be perfectly intuitive to anyone; the rules, limitations, and idiosyncrasies will have to be learned. There will be a formal language to be learned, as well as operating procedures. (Interactive facilities and other design aids may help—after the bugs get ironed out—but even their use has to be learned.)
The modeling tools today are much better than they were in 1978.
Then comes the actual modeling effort. A lot of energy will go into forcing a fit between the model and the enterprise. The correspondences won't always be obvious; there will be lots of alternatives, and it will take some iterations to recognize the best choices. Sometimes the perception of the enterprise will be adjusted to one which better fits the model. Sometimes the enterprise itself will be altered to fit the model. (It's not unusual for a company to adopt a whole new part numbering scheme before automating its inventory control.) This is all accompanied by the gargantuan task of simply collecting and coordinating a mountainous heap of descriptions.
Looking at the same information landscape through the eyes of a particular project, and then through the eyes of the enterprise, often produces two very different pictures. Recall from earlier discussions how important context is.
“Many corporations will be carrying out the lengthy job over the next 10 years of defining the thousands of data-item types they use and constructing, step by step, suitable schemas from which their databases will be built. The description of this large quantity of data will be an arduous task involving much argument between different interested parties. Eventually the massive databases that develop will become one of the corporation's major assets” [Martin].
Are we there yet? James Martin wrote this in 1975. I believe most companies are still far from completing this comprehensive and high-quality type of mapping.
The end result will be a physically large volume of information. “It must be emphasized…that the logical schema is a real and tangible item made most explicit in machine readable form, couched in some well defined and potentially standardizable language” [ANSI]. Think of it in the same orders of magnitude as a program library, or a system catalog, or a payroll file. Think of cylinders of disk space, and printouts many inches thick. Think of a small army of technical personnel who have been indoctrinated in a particular way of conceptualizing data, and who have mastered the intricacies of a new language and the attendant operational procedures.
All this time, manpower, and money will be invested by customers in any logical model supported in a major system. We had better be very careful about the architecture of the first one. Any attempt to replace it with a better one later will threaten that investment; customers won't accept the replacement any faster than they now accept a major new programming language, or a new operating system. And the replacements will forever be hamstrung by compatibility and migration requirements.
So it is a very good idea to get the logical data model right the first time!
Unfortunately, there are some natural forces which work against our getting it right the first time.
We are just entering a transitional phase in data description. The idea of having three levels of data description (i.e., including a logical model) has been much researched and written about ([ANSI], [GUIDE-SHARE]), but it hasn't yet taken serious hold in any significant commercial systems. It's still on the horizon; it's an idea whose time is just about to come. (I hope I won't still be saying that ten years from now.)
I think we are still saying this! Many organizations have accurate physical data models but are missing conceptual and logical data models, or have conceptual and logical data models that are inferior to, or out of date compared with, their physical data model counterparts.
The builders and users of today's commercial systems quite justifiably want to avoid cluttering their systems with anything that might impair efficiency and productivity. The argument that this new approach will make the overall management of data more productive in the long run has yet to be convincingly demonstrated to them.
On many Agile development projects today, for example, the idea of a Big Design Up Front (BDUF) is frowned upon as an activity that slows down delivery of the actual system.
The need for a more sophisticated descriptive model will only gradually achieve general recognition. It will come from the headaches of trying to crunch together the diverse record formats and data structures used by growing families of applications operating on the same integrated database. The nonsense of trying to reflect all their record formats in the logical model, while still pretending that the logical model describes the entities of the enterprise, will become apparent.
The need for a more sophisticated approach to data description will also grow as the interfaces of the data systems expand to involve more people who are not trained in computer disciplines. Such people will be involved both as end users and as managers of the information resource. Someday there will be a general recognition of what it means, and what it's worth, to model entities and relationships instead of data items and records. I hope that recognition won't come too late.
Are we there yet? We are still “fighting the fight,” expending a lot of effort to convince the business of the need for and value of the logical data model.